Efficient Mining of Intertransaction Association Rules
Authors
Abstract
Most previous studies on mining association rules address intratransaction associations, i.e., associations among items within the same transaction, where a transaction could be the items bought by the same customer, the events that happened on the same day, and so on. In this study, we break the barrier of transactions and extend the scope of association rule mining from traditional single-dimensional, intratransaction associations to multidimensional, intertransaction associations. An intertransaction association describes association relationships among different transactions. In a database of stock price information, an example of such an association is “if (company) A’s stock goes up on day one, B’s stock will go down on day two but go up on day four.” In this case, whether we treat company or day as the unit of transaction, the associated items belong to different transactions. Moreover, such an intertransaction association can be extended to associate multiple properties in the same rule, so that multidimensional intertransaction associations can also be defined and discovered. Mining intertransaction associations poses more challenges for efficient processing than mining intratransaction associations because the number of potential association rules becomes extremely large once the boundary of transactions is broken. In this study, we introduce the notion of an intertransaction association rule, define its measures, support and confidence, and develop an efficient algorithm, FITI (an acronym for “First Intra Then Inter”), for mining intertransaction associations. FITI adopts two major ideas: 1) an intertransaction frequent itemset contains only the frequent itemsets of its corresponding intratransaction counterpart; and 2) a special data structure is built among intratransaction frequent itemsets for efficient mining of intertransaction frequent itemsets.
We compare FITI with EH-Apriori, the best algorithm in our previous proposal, and demonstrate a substantial performance gain of FITI over EH-Apriori. Further extensions of the method and its implications are also discussed in the paper.
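The support and confidence measures mentioned in the abstract can be illustrated with a small brute-force sketch on the stock example. This is not FITI itself, and the abstract does not give the formal definitions, so several things here are assumptions: the toy database `DB`, the event names, the `(item, day offset)` encoding of extended items, and the choice to divide support by the number of days in the database. The last function checks the intuition behind the first FITI idea on this toy data: the items an intertransaction itemset places at any single offset must themselves be at least as frequent as an ordinary intratransaction itemset.

```python
from typing import Dict, FrozenSet, Set, Tuple

# Assumed encoding: an extended item is (item, offset from the reference day).
ExtItem = Tuple[str, int]
ExtItemset = FrozenSet[ExtItem]

# Hypothetical toy database: day index -> events observed on that day.
DB: Dict[int, Set[str]] = {
    1: {"A_up"},
    2: {"B_down"},
    3: {"A_up"},
    4: {"B_up", "B_down"},
    5: {"A_up"},
    6: {"B_down", "B_up"},
    7: set(),
    8: {"A_up"},
}


def contains(db: Dict[int, Set[str]], day: int, itemset: ExtItemset) -> bool:
    """True if every (item, offset) occurs on day + offset, relative to `day`."""
    return all(item in db.get(day + off, set()) for item, off in itemset)


def support_count(db: Dict[int, Set[str]], itemset: ExtItemset) -> int:
    """Number of reference days whose window contains the extended itemset."""
    return sum(1 for day in db if contains(db, day, itemset))


def support(db: Dict[int, Set[str]], itemset: ExtItemset) -> float:
    """Assumed definition: matching reference days / total days in the database."""
    return support_count(db, itemset) / len(db)


def confidence(db: Dict[int, Set[str]], body: ExtItemset, head: ExtItemset) -> float:
    """Fraction of reference days matching the body that also match the head."""
    body_count = support_count(db, body)
    if body_count == 0:
        return 0.0
    return support_count(db, body | head) / body_count


def intra_support_at_offset(db: Dict[int, Set[str]], itemset: ExtItemset, off: int) -> float:
    """Intratransaction support of the items the itemset places at one offset."""
    projected = {item for item, o in itemset if o == off}
    return sum(1 for events in db.values() if projected <= events) / len(db)


# "If A goes up on day one, B goes down on day two but up on day four."
body: ExtItemset = frozenset({("A_up", 0)})
head: ExtItemset = frozenset({("B_down", 1), ("B_up", 3)})

print(support(DB, body | head))     # 0.25 (reference days 1 and 3 match)
print(confidence(DB, body, head))   # 0.5  (A_up occurs on 4 days, 2 match)
```

On this toy data, the rule's support never exceeds the intratransaction support of its per-offset projections, which is why mining intratransaction frequent itemsets first can prune the intertransaction search space.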
Related resources
Recognition of emergent human behaviour in a smart home: A data mining approach
Motivated by a growing need for intelligent housing to accommodate aging populations, we propose a novel application of intertransaction association rule (IAR) mining to detect anomalous behaviour in smart home occupants. An efficient mining algorithm that avoids the candidate generation bottleneck limiting the application of current IAR mining algorithms on smart home data sets is detailed. An...
A new approach based on data envelopment analysis with double frontiers for ranking the discovered rules from data mining
Data envelopment analysis (DEA) is a relatively new data-oriented approach to evaluate performance of a set of peer entities called decision-making units (DMUs) that convert multiple inputs into multiple outputs. Within a relatively limited period, DEA has been converted into a strong quantitative and analytical tool to measure and evaluate performance. In an article written by Toloo et al. (2009...
Extensions from Intratransaction to Intertransaction Associations
The discovery of association rules from large amounts of structured or semi-structured data is an important data mining problem (Agrawal et al., 1993; Agrawal & Srikant, 1994; Braga et al., 2002, 2003; Cong et al., 2002; Miyahara et al., 2001; Termier et al., 2002; Xiao et al., 2003). It has crucial applications in decision support and marketing strategy. The most prototypical application of ass...
Identifying and Evaluating Effective Factors in Green Supplier Selection using Association Rules Analysis
Nowadays companies measure suppliers on the basis of a variety of factors and criteria that affect the supplier selection issue. This paper aims to identify the key effective criteria for the selection of green suppliers through an efficient algorithm called iterative process mining or i-PM. Green data were collected first by reviewing the previous studies to identify various environmental cri...
Introducing an algorithm for use to hide sensitive association rules through perturb technique
Due to the rapid growth of data mining technology, obtaining private data on users through this technology becomes easier. Association Rules Mining is one of the data mining techniques to extract useful patterns in the form of association rules. One of the main problems in applying this technique on databases is the disclosure of sensitive data by endangering security and privacy. Hiding the as...
Journal title:
- IEEE Trans. Knowl. Data Eng.
Volume 15, Issue
Pages -
Publication date 2003